Nextflow is a domain-specific language (DSL) implemented on top of the Groovy programming language, which in turn is a superset of the Java programming language. This means that Nextflow can run any Groovy or Java code.

1 Printing Values

To print a value, use the print or println method. println appends a newline character; print does not.

Example:
groovy
println("Hello, World!")
Hello, World!

2 Variables

To define a variable, simply assign a value to it:

groovy
x = 1
println x

x = new java.util.Date()
println x
1
Wed Dec 06 19:56:51 IST 2023
groovy
x = new java.util.Date()
println x
Wed Dec 06 19:56:52 IST 2023
groovy
x = -3.1499392
println x
-3.1499392
groovy
x = false
println x
false
groovy
x = "Hi"
println x
Hi

Local variables are defined using the def keyword:

groovy
def x = 'foo'
println x
foo

The def keyword should always be used when defining variables local to a function or a closure.
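The effect of def can be seen in a small sketch, assuming the code runs as a plain Groovy script. The variable and closure names are made up for illustration. A variable assigned without def inside a closure ends up in the script-wide binding, while one declared with def stays local to the closure:

```groovy
// Without def: the assignment inside the closure creates a script-wide variable.
setGlobal = { leaked = 'visible everywhere' }
setGlobal()
assert leaked == 'visible everywhere'

// With def: the variable exists only inside the closure body.
setLocal = {
    def hidden = 'only inside'
    return hidden
}
assert setLocal() == 'only inside'

// 'hidden' never reached the script binding.
assert !binding.hasVariable('hidden')
```

Keeping variables local in this way avoids accidental name clashes between unrelated closures.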

3 Data Types

3.1 Lists

A List object can be defined by placing the list items in square brackets:

groovy
list = [10, 20, 30, 40]

The items inside a list can be accessed using their index. List indexing begins at 0.

groovy
list = [10, 20, 30, 40]
println list[0]
println list.get(0)
10
10

The size method gives the length of a list.

groovy
list = [10, 20, 30, 40]
println list.size()
4

The assert keyword tests whether a condition is true. If the condition holds, Groovy prints nothing; otherwise it throws an AssertionError with a message describing the failed condition.

groovy
list = [10, 20, 30, 40]
assert list[0] == 10

Lists can also be indexed with negative indexes and reversed ranges.

groovy
list = [0, 1, 2]
assert list[-1] == 2
assert list[-1..0] == list.reverse()

Info:
In the last assert line, the reversed range -1..0 selects the elements from the last one (index -1, value 2) down to the first one (index 0, value 0), which yields the list in reverse order.
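A few more list operations build on the same indexing rules. The following is a minimal sketch in plain Groovy; the values are arbitrary and chosen only for illustration:

```groovy
list = [10, 20, 30, 40]

// A range selects a sub-list; negative indexes count from the end.
assert list[1..2] == [20, 30]
assert list[-2..-1] == [30, 40]

// The left-shift operator appends an item in place.
list << 50
assert list.size() == 5
assert list[-1] == 50

// collect applies a closure to every item and returns a new list.
doubled = list.collect { it * 2 }
assert doubled == [20, 40, 60, 80, 100]
```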

3.2 Maps

Maps are like lists that have an arbitrary key instead of an integer. Therefore, the syntax is very much aligned.

groovy
map = [a: 0, b: 1, c: 2]

Maps can be accessed in a conventional square-bracket syntax or as if the key was a property of the map.

groovy
map = [a: 0, b: 1, c: 2]
assert map['a'] == 0 
assert map.b == 1 
assert map.get('c') == 2 

To add data or to modify a map, the syntax is similar to adding values to a list:

groovy
map = [a: 0, b: 1, c: 2]
map['a'] = 'x' 
map.b = 'y' 
map.put('c', 'z') 
assert map == [a: 'x', b: 'y', c: 'z']
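The same syntax also adds new keys. The sketch below, in plain Groovy with arbitrary keys and values, shows adding entries, checking for a key, and iterating over key-value pairs:

```groovy
map = [a: 0, b: 1]

// Assigning to a key that does not exist yet adds a new entry.
map.c = 2
map['d'] = 3
assert map.keySet().toList() == ['a', 'b', 'c', 'd']

// A missing key returns null instead of raising an error.
assert map.containsKey('c')
assert map.z == null

// A two-parameter closure receives the key and the value of each entry.
pairs = map.collect { key, value -> "${key}=${value}".toString() }
assert pairs == ['a=0', 'b=1', 'c=2', 'd=3']
```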

3.3 String interpolation

String literals can be defined by enclosing them with either single- ('') or double- ("") quotation marks.

groovy
foxtype = 'quick'
foxcolor = ['b', 'r', 'o', 'w', 'n']
println "The $foxtype ${foxcolor.join()} fox"

x = 'Hello'
println '$x + $y'
The quick brown fox
$x + $y

Info:
Note the different use of $ and ${..} syntax to interpolate value expressions in a string literal. The $x variable was not expanded, as it was enclosed by single quotes.

Finally, string literals can also be defined using the / character as a delimiter. These are known as slashy strings and are useful for defining regular expressions and patterns, as there is no need to escape backslashes. Like double-quoted strings, they interpolate variables prefixed with a $ character.

Try the following to see the difference:

groovy
x = /tic\tac\toe/
y = 'tic\tac\toe'

println x
println y
tic\tac\toe
tic ac  oe
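Slashy strings are most useful with regular expressions. The following sketch, in plain Groovy, matches a file name against a pattern and extracts two capture groups. The file name and the pattern are hypothetical and serve only as an illustration:

```groovy
// Hypothetical file name used only for this example.
filename = 'sample_01.fastq.gz'

// In a slashy string, \w, \d and \. need no extra escaping.
pattern = /(\w+)_(\d+)\.fastq\.gz/

// The =~ operator creates a java.util.regex.Matcher.
matcher = filename =~ pattern
assert matcher.matches()
assert matcher.group(1) == 'sample'
assert matcher.group(2) == '01'
```

Written as a single-quoted string, the same pattern would need every backslash doubled.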

4 Control Structures

4.1 If statement

The if statement uses the same syntax common in other programming languages, such as Java, C, JavaScript, etc.

groovy
if (< boolean expression >) {
    // true branch
}
else {
    // false branch
}

The else branch is optional. Also, the curly brackets are optional when the branch defines just a single statement.

groovy
x = 1
if (x > 10)
    println 'Hello'

In a boolean context, null, empty strings, and empty collections evaluate to false.

Therefore a statement like:

groovy
list = [1, 2, 3]
if (list != null && list.size() > 0) {
    println list
}
else {
    println 'The list is empty'
}
[1, 2, 3]

Can be written as:

groovy
list = [1, 2, 3]
if (list)
    println list
else
    println 'The list is empty'
[1, 2, 3]
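These truth rules can be checked directly. The sketch below is plain Groovy; isTrue is a hypothetical helper closure introduced only for this example. It also shows that the number zero evaluates to false, and how the Elvis operator (?:) uses the same rules to supply a default value:

```groovy
// Hypothetical helper: converts any value to a boolean using Groovy's truth rules.
isTrue = { value -> value ? true : false }

assert !isTrue(null)
assert !isTrue('')
assert !isTrue([])
assert !isTrue([:])
assert !isTrue(0)

assert isTrue('text')
assert isTrue([0])

// The Elvis operator returns the left side if it is true, else the right side.
label = '' ?: 'unnamed'
assert label == 'unnamed'
```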

4.2 For Loops

The classical for loop syntax is supported as shown here:

groovy
for (int i = 0; i < 3; i++) {
    println("Hello World $i")
}
Hello World 0
Hello World 1
Hello World 2

Iteration over list objects is also possible using the syntax below:

groovy
list = ['a', 'b', 'c']

for (String elem : list) {
    println elem
}
a
b
c
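Groovy code often iterates with the each and eachWithIndex methods instead of an explicit for loop. The following is a minimal sketch in plain Groovy; the variable names are chosen for illustration:

```groovy
list = ['a', 'b', 'c']

// each calls the closure once per item.
upper = []
list.each { elem -> upper << elem.toUpperCase() }
assert upper == ['A', 'B', 'C']

// eachWithIndex also passes the zero-based position of the item.
labelled = []
list.eachWithIndex { elem, i -> labelled << "${i}:${elem}".toString() }
assert labelled == ['0:a', '1:b', '2:c']

// A for-in loop over a half-open range visits 0, 1 and 2.
total = 0
for (i in 0..<3) {
    total += i
}
assert total == 3
```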

5 Closures

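In Groovy, a closure is a block of code that can take arguments, be assigned to a variable, and be passed to other methods. It plays the role of a user-defined, anonymous function.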
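A short sketch in plain Groovy shows how closures are defined and called. The closure names are made up for illustration:

```groovy
// A closure is written in curly brackets; parameters come before the arrow.
square = { x -> x * x }
assert square(3) == 9
assert square.call(4) == 16

// With no declared parameter, the single argument is available as 'it'.
doubleIt = { it * 2 }
assert [1, 2, 3].collect(doubleIt) == [2, 4, 6]

// A closure can read variables from the scope in which it was defined.
factor = 10
scale = { n -> n * factor }
assert scale(2) == 20
```

The operators described later, such as map and view, take closures of exactly this form as arguments.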

6 Channels

Channels are a key data structure of Nextflow that allows the implementation of reactive-functional oriented computational workflows based on the Dataflow programming paradigm.

They are used to logically connect tasks to each other or to implement functional style data transformations.

6.1 Channel types

Nextflow distinguishes two different kinds of channels: queue channels and value channels.

6.1.1 Queue channel

A queue channel is an asynchronous unidirectional FIFO queue that connects two processes or operators.

  • asynchronous means that operations are non-blocking.

  • unidirectional means that data flows from a producer to a consumer.

  • FIFO means that the data is guaranteed to be delivered in the same order as it is produced. First In, First Out.

A queue channel is implicitly created by process output definitions or using channel factories such as Channel.of or Channel.fromPath.

6.1.2 Value channel

A value channel (a.k.a. singleton channel) by definition is bound to a single value and it can be read unlimited times without consuming its contents. A value channel is created using the value channel factory or by operators returning a single value, such as first, last, collect, count, min, max, reduce, and sum.

6.2 Channel factories

Channel factories are Nextflow methods for creating channels. Each factory expects a particular kind of input and creates a channel with a corresponding behavior.

6.2.1 value()

The value channel factory is used to create a value channel. An optional non-null argument can be specified to bind the channel to a specific value. For example:

ch1 = Channel.value() 
ch2 = Channel.value('Hello there') 
ch3 = Channel.value([1, 2, 3, 4, 5]) 

6.2.2 of()

The factory Channel.of allows the creation of a queue channel with the values specified as arguments.

ch = Channel.of(1, 3, 5, 7)
ch.view { "value: $it" }

The first line in this example creates a variable ch which holds a channel object. This channel emits the values specified as a parameter in the of channel factory. Thus the second line will print the following:

value: 1
value: 3
value: 5
value: 7

The Channel.of channel factory works in a similar manner to Channel.from (which is now deprecated), fixing some inconsistent behaviors of the latter and providing better handling when specifying a range of values.

6.2.3 fromList()

The Channel.fromList channel factory creates a channel emitting the elements provided by a list object specified as an argument:

list = ['hello', 'world']

Channel
    .fromList(list)

6.2.4 fromPath()

The fromPath channel factory creates a queue channel emitting one or more files matching the specified glob pattern.

Channel.fromPath('./data/meta/*.csv')

This example creates a channel and emits as many items as there are files with a csv extension in the ./data/meta folder. Each element is a file object implementing the Path interface.

Tip:
Two asterisks, i.e. **, work like * but cross directory boundaries. This syntax is generally used for matching complete paths. Curly brackets specify a collection of sub-patterns.
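The glob syntax described in the tip can be sketched as follows. This is a Nextflow script fragment, and all paths in it are hypothetical; they only illustrate the pattern syntax:

```groovy
// '*' stays within one directory: CSV files directly inside ./data/meta
Channel.fromPath('./data/meta/*.csv').view()

// '**' crosses directory boundaries: CSV files in ./data and all of its sub-directories
Channel.fromPath('./data/**.csv').view()

// Curly brackets list alternatives: files ending in .csv or .tsv
Channel.fromPath('./data/meta/*.{csv,tsv}').view()
```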

6.2.5 fromFilePairs()

The fromFilePairs channel factory creates a channel emitting the file pairs matching a glob pattern provided by the user. The matching files are emitted as tuples, in which the first element is the grouping key of the matching pair and the second element is the list of files (sorted in lexicographical order).

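The following script, bin/example_channels.nf, uses the channel factories described above. Its output is shown after the run command: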
#!/usr/bin/env nextflow

// Channel with explicit values
ch = Channel.of(1, 3, 5, 7)
ch.view { "value: $it" }

// Channel from a list
list = ['hello', 'world']
Channel.fromList(list).view()

// Channel from a text file
Channel.fromPath('./bin/text_input.txt').splitText().view()

// Channel from file pairs matching a pattern
Channel.fromFilePairs('./data/reads/*_{1,2}.fastq.gz').view()
bash
nextflow run bin/example_channels.nf
N E X T F L O W  ~  version 23.04.1
Launching `bin/example_channels.nf` [prickly_gates] DSL2 - revision: b19d201e37
hello
world
value: 1
value: 3
value: 5
value: 7
ENSG00000157764

NM_001301717

chr17:43044295-43170245

RS123456

gene123
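Because fromFilePairs emits tuples of a grouping key and a list of files, the two parts can be unpacked in a closure with two parameters. The following Nextflow fragment is a sketch that reuses the glob pattern from the script above; whether it prints anything depends on the files actually present in ./data/reads:

```groovy
Channel
    .fromFilePairs('./data/reads/*_{1,2}.fastq.gz')
    // id is the grouping key, files is the sorted list of matching files
    .view { id, files -> "key: $id, files: ${files*.name}" }
```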

7 Operators I

Operators are methods that allow you to connect channels, transform the values they emit, or apply user-provided rules to them.

7.1 view()

The view operator prints the items emitted by a channel to the console standard output, appending a new line character to each item. For example:

bash
cat bin/view_operator.nf
Channel
    .of('foo', 'bar', 'baz')
    .view()
bash
nextflow run bin/view_operator.nf
N E X T F L O W  ~  version 23.04.1
Launching `bin/view_operator.nf` [festering_cantor] DSL2 - revision: 969a56608b
foo
bar
baz

7.2 map()

The map operator applies a function of your choosing to every item emitted by a channel and returns the items obtained as a new channel. The function applied is called the mapping function and is expressed with a closure as shown in the example below:

bash
cat bin/map_operator1.nf
Channel
    .of('hello', 'world')
    .map { it -> it.reverse() }
    .view()
bash
nextflow run bin/map_operator1.nf
N E X T F L O W  ~  version 23.04.1
Launching `bin/map_operator1.nf` [hungry_ptolemy] DSL2 - revision: 991adec947
olleh
dlrow

The map operator can also associate a tuple with each element. The tuple can contain any data.

bash
cat bin/map_operator2.nf
Channel
    .of('hello', 'world')
    .map { word -> [word, word.size()] }
    .view { word, len -> "$word contains $len letters" }
bash
nextflow run bin/map_operator2.nf
N E X T F L O W  ~  version 23.04.1
Launching `bin/map_operator2.nf` [spontaneous_jones] DSL2 - revision: 5d9f685614
hello contains 5 letters
world contains 5 letters

7.3 mix()

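The mix operator combines the items emitted by two (or more) channels into a single channel. The items in the resulting channel may appear in any order, regardless of which source channel they came from.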

bash
cat bin/mix_operator.nf
my_channel_1 = Channel.of(1, 2, 3)
my_channel_2 = Channel.of('a', 'b')
my_channel_3 = Channel.of('z')

my_channel_1
    .mix(my_channel_2, my_channel_3)
    .view()
bash
nextflow run bin/mix_operator.nf
N E X T F L O W  ~  version 23.04.1
Launching `bin/mix_operator.nf` [sad_gautier] DSL2 - revision: d9ca7f7409
1
z
2
3
a
b

7.4 join()

The join operator creates a channel that joins together the items emitted by two channels with a matching key. By default, the key is the first element of each emitted item. Items without a match in the other channel, such as ['P', 7] in the example below, are discarded by default.

bash
cat bin/join_operator.nf
left = Channel.of(['X', 1], ['Y', 2], ['Z', 3], ['P', 7])
right = Channel.of(['Z', 6], ['Y', 5], ['X', 4])
left.join(right).view()
bash
nextflow run bin/join_operator.nf
N E X T F L O W  ~  version 23.04.1
Launching `bin/join_operator.nf` [tiny_davinci] DSL2 - revision: 5951346a4b
[Y, 2, 5]
[X, 1, 4]
[Z, 3, 6]
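Unmatched items can be kept with the remainder option. The following Nextflow fragment is a sketch based on the same two channels as above:

```groovy
left = Channel.of(['X', 1], ['Y', 2], ['Z', 3], ['P', 7])
right = Channel.of(['Z', 6], ['Y', 5], ['X', 4])

// remainder: true also emits items that have no match, with null in place
// of the missing value, so ['P', 7] is emitted as [P, 7, null].
left.join(right, remainder: true).view()
```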

7.5 combine()

The combine operator combines (Cartesian product) the items emitted by two channels or by a channel and a Collection object (as right operand). combine returns a queue channel. For example:

bash
cat bin/combine_operator1.nf
numbers = Channel.of(1, 2, 3)
words = Channel.of('hello', 'ciao')
numbers
    .combine(words)
    .view()
bash
nextflow run bin/combine_operator1.nf
N E X T F L O W  ~  version 23.04.1
Launching `bin/combine_operator1.nf` [lonely_booth] DSL2 - revision: aa84e750a1
[1, hello]
[2, hello]
[3, hello]
[1, ciao]
[2, ciao]
[3, ciao]

A second version of the combine operator allows you to combine items that share a common matching key. The index of the key element is specified by using the by parameter (zero-based index, multiple indices can be specified as a list of integers). For example:

bash
cat bin/combine_operator2.nf
left = Channel.of(['A', 1], ['B', 2], ['A', 3])
right = Channel.of(['B', 'x'], ['B', 'y'], ['A', 'z'], ['A', 'w'])

left
    .combine(right, by: 0)
    .view()
bash
nextflow run bin/combine_operator2.nf
N E X T F L O W  ~  version 23.04.1
Launching `bin/combine_operator2.nf` [special_kalam] DSL2 - revision: f0fa879250
[B, 2, x]
[B, 2, y]
[A, 1, z]
[A, 3, z]
[A, 1, w]
[A, 3, w]

7.6 concat()

The concat operator allows you to concatenate the items emitted by two or more channels to a new channel. The items emitted by the resulting channel are in the same order as specified in the operator arguments.

In other words, given N channels, the items from the i+1 th channel are emitted only after all of the items from the i th channel have been emitted.

For example:

bash
cat bin/concat_operator.nf
a = Channel.of('a', 'b', 'c')
b = Channel.of(1, 2, 3)
c = Channel.of('p', 'q')

c.concat( b, a ).view()
bash
nextflow run bin/concat_operator.nf
N E X T F L O W  ~  version 23.04.1
Launching `bin/concat_operator.nf` [serene_plateau] DSL2 - revision: aceedaf42d
p
q
1
2
3
a
b
c

7.7 count()

The count operator creates a channel that emits a single item: a number that represents the total number of items emitted by the source channel. For example:

bash
cat bin/count_operator.nf
Channel
    .of(9,1,7,5)
    .count()
    .view()
bash
nextflow run bin/count_operator.nf
N E X T F L O W  ~  version 23.04.1
Launching `bin/count_operator.nf` [fabulous_stallman] DSL2 - revision: d3408f2eb5
4

7.8 ifEmpty()

The ifEmpty operator creates a channel that emits a default value, specified as the operator parameter, when the channel to which it is applied is empty, i.e. does not emit any value. Otherwise, it emits the same sequence of items as the original channel.

In the following example the source channel is not empty, so its items are emitted unchanged:

bash
cat bin/ifempty_operator.nf
Channel
    .of(1, 2, 3)
    .ifEmpty('Hello')
    .view()
bash
nextflow run bin/ifempty_operator.nf
N E X T F L O W  ~  version 23.04.1
Launching `bin/ifempty_operator.nf` [thirsty_stallman] DSL2 - revision: 188c99253a
1
2
3
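The default value is used only when the source channel emits nothing. As a sketch, the following Nextflow fragment applies ifEmpty to a channel created with Channel.empty(), so the default value 'Hello' is emitted instead:

```groovy
Channel
    .empty()            // a channel that emits no items
    .ifEmpty('Hello')   // so the default value is emitted
    .view()             // prints: Hello
```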

7.9 toSortedList()

The toSortedList operator collects all the items emitted by a channel to a List object where they are sorted and emits the resulting collection as a single item. For example:

bash
cat bin/tosortedlist_operator.nf
Channel
    .of( 3, 2, 1, 4 )
    .toSortedList()
    .subscribe onNext: { println it }, onComplete: { println 'Done' }
bash
nextflow run bin/tosortedlist_operator.nf
N E X T F L O W  ~  version 23.04.1
Launching `bin/tosortedlist_operator.nf` [disturbed_kowalevski] DSL2 - revision: 25e810eb00
[1, 2, 3, 4]
Done
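toSortedList also accepts a closure that defines a custom sort order. The following Nextflow fragment is a sketch that sorts the same values in descending order by comparing the items in reverse:

```groovy
Channel
    .of(3, 2, 1, 4)
    // comparing b with a (instead of a with b) reverses the order
    .toSortedList { a, b -> b <=> a }
    .view()             // prints: [4, 3, 2, 1]
```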

7.10 unique()

The unique operator allows you to remove duplicate items from a channel and only emit single items with no repetition.

For example:

bash
cat bin/unique_operator.nf
Channel
    .of( 1, 1, 1, 5, 7, 7, 7, 3, 3 )
    .unique()
    .view()
bash
nextflow run bin/unique_operator.nf
N E X T F L O W  ~  version 23.04.1
Launching `bin/unique_operator.nf` [determined_bernard] DSL2 - revision: 5722dca47a
1
5
7
3

7.11 take()

The take operator emits only the first n items emitted by a channel and discards the rest. For example:

bash
cat bin/take_operator.nf
Channel
    .of( 1, 2, 3, 4, 5, 6 )
    .take( 3 )
    .view()
bash
nextflow run bin/take_operator.nf
N E X T F L O W  ~  version 23.04.1
Launching `bin/take_operator.nf` [pedantic_torvalds] DSL2 - revision: dd5943541c
1
2
3